library(estimatr)
# Difference in mean item counts between the lists with and without the sensitive item
difference_in_means(count ~ sensitive, data = df)
Gustavo Diaz
McMaster University
gustavodiaz.org
diazg2@mcmaster.ca
Slides: talks.gustavodiaz.org/nu1
Have you lied about having COVID symptoms?
Would you bribe a police officer to avoid a traffic ticket?
Have you been offered goods or favors for your vote?
Do you know anyone with ties to a militant organization?
Would you oppose a black family moving next door?
Would you allow Muslim immigrants to become citizens?
They are sensitive questions
We can only learn about them using surveys
But asking about them directly leads to misreporting
This form of measurement error is called sensitivity bias
Honesty appeals
Confidentiality protocols
Randomized response
Network scale-up
Endorsement experiments
List experiments
Here is a list of things that some people have done.
Please listen to them and then tell me HOW MANY of them you have done in the past two years.
Do not tell me which ones. Just tell me HOW MANY:
\[ \begin{aligned} \text{Proportion(Voted yes)} &= \text{Mean(List with sensitive item)} \\ &\quad - \text{Mean(List without sensitive item)} \end{aligned} \]
We get a prevalence rate estimate but we do not know how individual respondents voted!
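The difference-in-means estimator above can be sketched with simulated data. Everything in this block is an assumption made for illustration: the data frame `df`, the columns `count` and `sensitive`, the three baseline items, and the 30% true prevalence. It uses base R only, so it runs without extra packages.

```r
# Simulated list experiment (all values are illustrative assumptions)
set.seed(42)
n <- 10000

# Random assignment: 1 = list with the sensitive item, 0 = list without it
sensitive <- rbinom(n, size = 1, prob = 0.5)

# Number of baseline items each respondent reports (3 baseline items)
baseline <- rbinom(n, size = 3, prob = 0.5)

# Whether the sensitive item applies to each respondent (assumed 30% prevalence)
holds_trait <- rbinom(n, size = 1, prob = 0.3)

# The sensitive item only adds to the count when it appears in the list
df <- data.frame(
  sensitive = sensitive,
  count = baseline + sensitive * holds_trait
)

# Prevalence estimate: Mean(list with item) - Mean(list without item)
prevalence_hat <- mean(df$count[df$sensitive == 1]) -
  mean(df$count[df$sensitive == 0])

prevalence_hat
```

With a data frame shaped like this, `difference_in_means(count ~ sensitive, data = df)` from `estimatr` should give the same point estimate, together with a standard error and confidence interval.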
Did you vote YES or NO on the Personhood Initiative, which appeared on the November 2011 Mississippi General Election Ballot?
List A
List B
Organization X (advocating for immigration reduction and measures against undocumented immigration)
Everyone sees it
Randomly appears in list A or B
Equivalent to two parallel list experiments
\[ \hat{\tau}_A = \text{Mean}(A_t) - \text{Mean}(A_c) \]
\[ \hat{\tau}_B = \text{Mean}(B_t) - \text{Mean}(B_c) \]
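A minimal sketch of the two parallel estimators, assuming a hypothetical data frame with one row per respondent, columns `count_A` and `count_B` for the two lists, and `sensitive_in_A` marking which list received the sensitive item. Averaging the two estimates is a common way to pool a double list experiment, but it is an assumption here; the formulas above only define the two separate estimators.

```r
# Hypothetical responses to a double list experiment
dle <- data.frame(
  sensitive_in_A = c(1, 1, 1, 0, 0, 0),  # 1 = sensitive item shown in list A
  count_A        = c(3, 2, 3, 2, 1, 2),
  count_B        = c(2, 1, 2, 3, 2, 2)
)

# List A is "treated" when the sensitive item appears in A
tau_A <- mean(dle$count_A[dle$sensitive_in_A == 1]) -
  mean(dle$count_A[dle$sensitive_in_A == 0])

# List B is "treated" when the sensitive item appears in B (that is, not in A)
tau_B <- mean(dle$count_B[dle$sensitive_in_A == 0]) -
  mean(dle$count_B[dle$sensitive_in_A == 1])

# Pooled estimate (assumption: simple average of the two)
tau_pooled <- (tau_A + tau_B) / 2

c(tau_A = tau_A, tau_B = tau_B, pooled = tau_pooled)
```

Because every respondent answers both lists, each person serves as a treated unit for one list and a control unit for the other.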
| List order | Sensitive item location |
|---|---|
| Fixed | Fixed |
| Randomized | Fixed |
| Fixed | Randomized |
| Randomized | Randomized |
Design effect (Blair and Imai 2012)
The inclusion of a sensitive item affects how survey participants respond to the baseline items within the list.
Carryover design effect
The inclusion of a sensitive item in one list affects how participants respond to the baseline items in the other list.
| Observed response | List 1 | List 2 | Difference |
|---|---|---|---|
| Baseline | 2 | 2 | 0 |
| **Deflation** | | | |
| Sensitive first | 1 | 1 | 0 |
| Sensitive second | 2 | 1 | 1 |
| **Inflation** | | | |
| Sensitive first | 3 | 3 | 0 |
| Sensitive second | 2 | 3 | -1 |
Goal: Detect asymmetric shift across treatment schedules
Difference-in-differences
Signed-rank test
\[ \hat{\tau}_1 = \text{Mean}(\text{First list}_t) - \text{Mean}(\text{First list}_c) \]
\[ \hat{\tau}_2 = \text{Mean}(\text{Second list}_t) - \text{Mean}(\text{Second list}_c) \]
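One way to read the difference-in-differences idea, offered as a sketch rather than the exact test behind the reported results: estimate prevalence in whichever list is answered first, estimate it again in whichever list is answered second, and compare the two. The column names (`sensitive_first`, `first_count`, `second_count`) and the toy data are hypothetical, and using the difference between the two estimates as the statistic is an assumption.

```r
# Hypothetical data: counts for the list answered first and the list answered second
resp <- data.frame(
  sensitive_first = c(1, 1, 1, 1, 0, 0, 0, 0),  # 1 = sensitive item in the first list
  first_count     = c(3, 3, 2, 3, 2, 2, 1, 2),
  second_count    = c(2, 2, 1, 2, 3, 2, 2, 3)
)

# Difference in means between treated and control respondents
diff_means <- function(y, treated) {
  mean(y[treated == 1]) - mean(y[treated == 0])
}

# First list: treated when the sensitive item appears first
tau_1 <- diff_means(resp$first_count, resp$sensitive_first)

# Second list: treated when the sensitive item appears second
tau_2 <- diff_means(resp$second_count, 1 - resp$sensitive_first)

# Difference between the two estimates
did <- tau_1 - tau_2

c(tau_1 = tau_1, tau_2 = tau_2, did = did)
```

If there is no carryover design effect, both estimates target the same prevalence rate, so the difference should be close to zero. Quantifying uncertainty around it is left out of this sketch.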
| Experiment | Statistic | p-value |
|---|---|---|
| Organization X (advocacy group) | 0.079 | 0.623 |
| Organization Y (border patrol) | -0.268 | 0.082 |
Facebook sample of Montevideo residents (N = 2688)
Four criminal governance strategies
Things people have experienced in the last six months:
| List A | List B |
|---|---|
| Saw people doing sports | Saw people playing soccer |
| Visited friends | Chatted with friends |
| Activities by feminist groups | Activities by LGBTQ groups |
| Went to church | Went to charity events |
| Gangs threatening neighbors | Did not drink mate |
Things people have experienced in the last six months:
| List A | List B |
|---|---|
| Saw people doing sports | Saw people playing soccer |
| Visited friends | Chatted with friends |
| Activities by feminist groups | Activities by LGBTQ groups |
| Went to church | Went to charity events |
| Did not drink mate | Gangs threatening neighbors |
Placebo item more frequent than we anticipated
Offsets prevalence rates we would have observed
Solution: Reconstruct estimate bounds without placebo item
Challenge: Respondents may have noticed sensitive item and altered responses in unintended ways
Use tests to rule out strategic errors
| Sensitive item | Statistic | p-value |
|---|---|---|
| Threaten neighbors | 0.12 | 0.41 |
| Evict neighbors | 0.08 | 0.58 |
| Make donations | -0.24 | 0.16 |
| Offer work | -0.11 | 0.47 |
Observed test statistics not unlikely under the null hypothesis of no carryover design effect
Placement of the sensitive item (number of respondents):

| Sensitive item | List A | List B |
|---|---|---|
| Organization X | 545 | 525 |
| Organization Y | 537 | 543 |
\[ \widetilde{T} = \sum_{i=1}^N \text{sgn} \{(z_i - (1-z_i)) (Y_{i1} - Y_{i2})\} \times \tilde{q}_i \]
\[ \tilde{q}_i = {q_i-1 \choose m-1} \text{ for } q_i \geq m \]
\[ \tilde{q}_i = 0 \text{ for } q_i < m \]
\[ \text{with } 1 \leq m \leq N \]
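The statistic above can be sketched in a few lines of R. The formula does not define every symbol, so this block makes some assumptions: `z` is taken to be the treatment-schedule indicator, `q` the rank of the absolute difference between the two list responses, and ties are broken by position so that ranks stay integers. The function name `signed_rank_stat` and the toy data are hypothetical.

```r
# Sketch of the signed-rank statistic with transformed ranks.
# Assumptions: z is the treatment-schedule indicator, q is the rank of
# |y1 - y2| (ties broken by position), m is the chosen subset size.
signed_rank_stat <- function(z, y1, y2, m) {
  d <- y1 - y2
  q <- rank(abs(d), ties.method = "first")
  # choose() returns 0 when q - 1 < m - 1, matching q_tilde = 0 for q < m
  q_tilde <- choose(q - 1, m - 1)
  sum(sign((z - (1 - z)) * d) * q_tilde)
}

# Toy data: four respondents
z  <- c(1, 0, 1, 0)
y1 <- c(3, 2, 1, 4)
y2 <- c(1, 1, 3, 0)

stat_m1 <- signed_rank_stat(z, y1, y2, m = 1)
stat_m2 <- signed_rank_stat(z, y1, y2, m = 2)
```

With `m = 1` every transformed rank equals one, so the statistic reduces to a sum of signs. Larger values of `m` give more weight to respondents with the largest absolute differences.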
| m | Organization X: Statistic | Organization X: p-value | Organization Y: Statistic | Organization Y: p-value |
|---|---|---|---|---|
| 2 | 8.356400e+04 | 1 | 3.571300e+04 | 1 |
| 5 | 3.809258e+12 | 1 | 3.323093e+12 | 1 |
| 10 | 1.791638e+23 | 1 | 1.825804e+23 | 1 |
| 50 | 1.439408e+86 | 1 | 2.533938e+86 | 1 |
| Item | Placebo I: Estimate | Placebo I: p | Placebo I: n | Placebo II: Estimate | Placebo II: p | Placebo II: n |
|---|---|---|---|---|---|---|
| Donate | 0.08 | 0.55 | 133 | -0.03 | 0.32 | 635 |
| Evict | 0.14 | 0.46 | 32 | 0.00 | 0.95 | 628 |
| Threaten | 0.24 | 0.08 | 102 | 0.02 | 0.44 | 641 |
| Work | 0.03 | 0.88 | 64 | 0.01 | 0.58 | 647 |
How many X do you know, who also know you, with whom you have interacted in the last year in person, by phone, or any other channel?
How many X do you know…
From Las Piedras
Male 25-29
Police officers
University students
Had a kid last year
Passed away last year
Married last year
Female 45-49
Public employees
Welfare card holders
Registered with party
With kids in public school
Did not vote in last election
Currently in jail
Recently unemployed
SENSITIVE ITEM